Overview

Dataset statistics

Number of variables: 16
Number of observations: 135617
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.6 MiB
Average record size in memory: 128.0 B
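
The two memory figures above are consistent with each of the 16 variables being stored as a 64-bit (8-byte) value. That storage width is an assumption, not something the report states; the sketch below only checks the arithmetic under that assumption, and it ignores any index overhead.

```python
# Figures copied from the "Dataset statistics" table.
N_VARIABLES = 16
N_OBSERVATIONS = 135617
BYTES_PER_VALUE = 8  # assumption: every column is held as a 64-bit value


def average_record_size(n_variables: int, bytes_per_value: int) -> int:
    """Bytes needed for one row when each column uses bytes_per_value bytes."""
    return n_variables * bytes_per_value


def total_size_mib(n_observations: int, record_size: int) -> float:
    """Total size in MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_observations * record_size / (1024 * 1024)


record_size = average_record_size(N_VARIABLES, BYTES_PER_VALUE)
total_mib = round(total_size_mib(N_OBSERVATIONS, record_size), 1)
print(record_size, "B per record")
print(total_mib, "MiB in total")
```

Under the same assumption a single column takes 135617 * 8 bytes, which rounds to the 1.0 MiB "Memory size" listed for each variable.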

Variable types

NUM: 11
BOOL: 5

Reproduction

Analysis started: 2021-05-24 04:29:36.294771
Analysis finished: 2021-05-24 04:30:29.121268
Duration: 52.83 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
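
The reported duration can be recomputed from the two timestamps. This is a small sketch using only the standard library; the helper name duration_seconds is made up for the illustration.

```python
from datetime import datetime

# Timestamp layout used by the "Analysis started" / "Analysis finished" rows.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed time between two report timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


elapsed = duration_seconds(
    "2021-05-24 04:29:36.294771",
    "2021-05-24 04:30:29.121268",
)
print(f"{elapsed:.2f} seconds")
```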

Warnings

add_group is highly correlated with add_friend [High correlation]
add_friend is highly correlated with add_group [High correlation]
finish_num is highly correlated with learn_num [High correlation]
learn_num is highly correlated with finish_num [High correlation]
coupon is highly skewed (γ1 = 58.57520335) [Skewed]
user_id has unique values [Unique]
login_diff_time has 12476 (9.2%) zeros [Zeros]
distance_day has 1438 (1.1%) zeros [Zeros]
login_time has 7932 (5.8%) zeros [Zeros]
launch_time has 87751 (64.7%) zeros [Zeros]
camp_num has 9741 (7.2%) zeros [Zeros]
learn_num has 27334 (20.2%) zeros [Zeros]
finish_num has 45739 (33.7%) zeros [Zeros]
coupon has 122104 (90.0%) zeros [Zeros]
course_order_num has 127009 (93.7%) zeros [Zeros]
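
The percentages in the Zeros warnings are the zero counts divided by the 135617 observations. The sketch below recomputes them from the counts listed above; the function name zero_share is hypothetical, and nothing here reflects how the profiler decides when to raise a warning.

```python
N_OBSERVATIONS = 135617

# Zero counts copied from the warnings list.
ZERO_COUNTS = {
    "login_diff_time": 12476,
    "distance_day": 1438,
    "login_time": 7932,
    "launch_time": 87751,
    "camp_num": 9741,
    "learn_num": 27334,
    "finish_num": 45739,
    "coupon": 122104,
    "course_order_num": 127009,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Percentage of rows equal to zero, rounded to one decimal place."""
    return round(100 * count / total, 1)


for name, count in ZERO_COUNTS.items():
    print(f"{name} has {count} ({zero_share(count)}%) zeros")
```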

Variables

user_id
Real number (ℝ≥0)

UNIQUE

Distinct count135617
Unique (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2000002535200329.5
Minimum2000001555945280
Maximum2000002948014779
Zeros0
Zeros (%)0.0%
Memory size1.0 MiB

Quantile statistics

Minimum2.000001556e+15
5-th percentile2.000002282e+15
Q12.000002415e+15
median2.000002499e+15
Q32.000002718e+15
95-th percentile2.000002919e+15
Maximum2.000002948e+15
Range1392069499
Interquartile range (IQR)303046960

Descriptive statistics

Standard deviation249996404
Coefficient of variation (CV)1.249980436e-07
Kurtosis3.719889271
Mean2.000002535e+15
Median Absolute Deviation (MAD)114796193
Skewness-1.247888558
Sum-5.466817289e+18
Variance6.249820203e+16
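
The negative Sum is notable because every user_id is positive. One plausible explanation, offered here as an assumption rather than something the report states, is that the sum was accumulated in a signed 64-bit integer and wrapped around. The sketch below multiplies the reported mean by the number of observations and reduces the result to the signed 64-bit range.

```python
from decimal import Decimal

N_OBSERVATIONS = 135617
MEAN = Decimal("2000002535200329.5")  # "Mean" reported for user_id
REPORTED_SUM = -5.466817289e18        # "Sum" reported for user_id


def wrap_int64(value: int) -> int:
    """Reduce an arbitrary integer to the signed 64-bit range (two's complement)."""
    return (value + 2**63) % 2**64 - 2**63


# About 2.712e20, far above the largest signed 64-bit value (about 9.223e18).
exact_sum = int(MEAN * N_OBSERVATIONS)
wrapped = wrap_int64(exact_sum)

print(exact_sum > 2**63 - 1)
print(wrapped)
```

The wrapped value agrees with the reported Sum to the precision shown in the report, which supports the overflow reading. Treating user_id as an identifier rather than a number would sidestep the issue.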
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2.000002763e+151< 0.1%
 
2.000002458e+151< 0.1%
 
2.000002739e+151< 0.1%
 
2.000002421e+151< 0.1%
 
2.000001659e+151< 0.1%
 
2.00000244e+151< 0.1%
 
2.000002356e+151< 0.1%
 
2.000002829e+151< 0.1%
 
2.000002421e+151< 0.1%
 
2.000002753e+151< 0.1%
 
Other values (135607)135607> 99.9%
 
ValueCountFrequency (%) 
2.000001556e+151< 0.1%
 
2.000001557e+151< 0.1%
 
2.000001558e+151< 0.1%
 
2.000001558e+151< 0.1%
 
2.000001558e+151< 0.1%
 
ValueCountFrequency (%) 
2.000002948e+151< 0.1%
 
2.000002947e+151< 0.1%
 
2.000002947e+151< 0.1%
 
2.000002947e+151< 0.1%
 
2.000002947e+151< 0.1%
 

login_day
Real number (ℝ)

Distinct count50
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4.183258735999174
Minimum-1
Maximum108
Zeros0
Zeros (%)0.0%
Memory size1.0 MiB

Quantile statistics

Minimum-1
5-th percentile1
Q13
median4
Q36
95-th percentile8
Maximum108
Range109
Interquartile range (IQR)3
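
Range and IQR follow directly from the other rows of the quantile table: Range is Maximum minus Minimum, and IQR is Q3 minus Q1. A minimal check against the login_day and distance_day figures in this report (the helper names are made up for the illustration):

```python
def value_range(minimum, maximum):
    """Range = Maximum - Minimum."""
    return maximum - minimum


def interquartile_range(q1, q3):
    """IQR = Q3 - Q1, the spread of the middle half of the data."""
    return q3 - q1


# login_day: Minimum -1, Q1 3, Q3 6, Maximum 108
login_day_range = value_range(-1, 108)
login_day_iqr = interquartile_range(3, 6)

# distance_day: Minimum -1275, Q1 38, Q3 180, Maximum 6588
distance_day_range = value_range(-1275, 6588)
distance_day_iqr = interquartile_range(38, 180)

print(login_day_range, login_day_iqr)
print(distance_day_range, distance_day_iqr)
```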

Descriptive statistics

Standard deviation2.363427555
Coefficient of variation (CV)0.5649728368
Kurtosis125.8400782
Mean4.183258736
Median Absolute Deviation (MAD)2
Skewness3.611565193
Sum567321
Variance5.585789808
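
Several of the descriptive statistics are tied together by simple identities: Mean = Sum / number of observations, Variance = (Standard deviation)^2, and CV = Standard deviation / Mean. The sketch below checks those identities against the login_day figures; the tolerance is chosen loosely because the report rounds each value to about ten significant digits.

```python
N_OBSERVATIONS = 135617

# login_day figures copied from the report.
SUM = 567321
MEAN = 4.183258736
STD = 2.363427555
VARIANCE = 5.585789808
CV = 0.5649728368

TOLERANCE = 1e-7  # loose, to absorb rounding in the reported digits

mean_from_sum = SUM / N_OBSERVATIONS
variance_from_std = STD ** 2
cv_from_std = STD / MEAN

print(abs(mean_from_sum - MEAN) < TOLERANCE)
print(abs(variance_from_std - VARIANCE) < TOLERANCE)
print(abs(cv_from_std - CV) < TOLERANCE)
```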
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
52122915.7%
 
42003514.8%
 
61986514.6%
 
31893814.0%
 
21706912.6%
 
71480510.9%
 
1124769.2%
 
862124.6%
 
-142003.1%
 
94810.4%
 
Other values (40)3070.2%
 
ValueCountFrequency (%) 
-142003.1%
 
1124769.2%
 
21706912.6%
 
31893814.0%
 
42003514.8%
 
ValueCountFrequency (%) 
1081< 0.1%
 
1021< 0.1%
 
1011< 0.1%
 
911< 0.1%
 
841< 0.1%
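
The "fixed size bins (bins=10)" caption suggests equal-width bins between the minimum and the maximum. The sketch below shows that binning scheme for login_day (Minimum -1, Maximum 108). The exact binning rule used by the profiler is an assumption here, and both function names are hypothetical.

```python
def fixed_bin_edges(minimum, maximum, bins=10):
    """Edges of `bins` equal-width bins spanning [minimum, maximum]."""
    span = maximum - minimum
    return [minimum + span * i / bins for i in range(bins + 1)]


def bin_index(value, minimum, maximum, bins=10):
    """Index of the bin holding `value`; the last bin includes the maximum."""
    position = int((value - minimum) * bins / (maximum - minimum))
    return min(position, bins - 1)


edges = fixed_bin_edges(-1, 108)
print(edges)
print(bin_index(9, -1, 108), bin_index(10, -1, 108), bin_index(108, -1, 108))
```

With a bin width of 10.9, the values -1 through 9 that dominate the common-values table all fall into the first bin, which is why a fixed-width histogram of such a long-tailed variable looks almost like a single bar.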
 

login_diff_time
Real number (ℝ)

ZEROS

Distinct count662
Unique (%)0.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.0862624892159536
Minimum-1.0
Maximum135.0
Zeros12476
Zeros (%)9.2%
Memory size1.0 MiB

Quantile statistics

Minimum-1
5-th percentile0
Q10.75
median1
Q31.2
95-th percentile2
Maximum135
Range136
Interquartile range (IQR)0.45

Descriptive statistics

Standard deviation1.933017576
Coefficient of variation (CV)1.779512407
Kurtosis524.840013
Mean1.086262489
Median Absolute Deviation (MAD)0.25
Skewness17.49091975
Sum147315.66
Variance3.736556951
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
12982322.0%
 
0124769.2%
 
0.576045.6%
 
0.8369665.1%
 
0.868635.1%
 
0.8666874.9%
 
1.1764004.7%
 
0.7561544.5%
 
0.6761114.5%
 
0.8856254.1%
 
Other values (652)4090830.2%
 
ValueCountFrequency (%) 
-142003.1%
 
0124769.2%
 
0.576045.6%
 
0.6761114.5%
 
0.7561544.5%
 
ValueCountFrequency (%) 
1351< 0.1%
 
85.331< 0.1%
 
781< 0.1%
 
74.81< 0.1%
 
71.21< 0.1%
 

distance_day
Real number (ℝ)

ZEROS

Distinct count353
Unique (%)0.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean136.36451919744576
Minimum-1275
Maximum6588
Zeros1438
Zeros (%)1.1%
Memory size1.0 MiB

Quantile statistics

Minimum-1275
5-th percentile11
Q138
median84
Q3180
95-th percentile377
Maximum6588
Range7863
Interquartile range (IQR)142

Descriptive statistics

Standard deviation135.5882317
Coefficient of variation (CV)0.9943072622
Kurtosis85.20759251
Mean136.3645192
Median Absolute Deviation (MAD)57
Skewness3.004213569
Sum18493347
Variance18384.16859
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
-142283.1%
 
2128072.1%
 
2025991.9%
 
37424911.8%
 
2222841.7%
 
37321811.6%
 
37921221.6%
 
4320071.5%
 
37519991.5%
 
4419951.5%
 
Other values (343)11090481.8%
 
ValueCountFrequency (%) 
-12751< 0.1%
 
-9811< 0.1%
 
-231< 0.1%
 
-151< 0.1%
 
-143< 0.1%
 
ValueCountFrequency (%) 
65881< 0.1%
 
64821< 0.1%
 
43931< 0.1%
 
30061< 0.1%
 
26651< 0.1%
 

login_time
Real number (ℝ≥0)

ZEROS

Distinct count644
Unique (%)0.5%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean38.096684044035776
Minimum0
Maximum1480
Zeros7932
Zeros (%)5.8%
Memory size1.0 MiB

Quantile statistics

Minimum0
5-th percentile0
Q17
median21
Q344
95-th percentile140
Maximum1480
Range1480
Interquartile range (IQR)37

Descriptive statistics

Standard deviation57.63938892
Coefficient of variation (CV)1.512976532
Kurtosis37.29703934
Mean38.09668404
Median Absolute Deviation (MAD)17
Skewness4.58075032
Sum5166558
Variance3322.299156
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
185396.3%
 
079325.8%
 
253874.0%
 
334122.5%
 
426532.0%
 
1426452.0%
 
1625611.9%
 
1525561.9%
 
1325481.9%
 
1225351.9%
 
Other values (634)9484969.9%
 
ValueCountFrequency (%) 
079325.8%
 
185396.3%
 
253874.0%
 
334122.5%
 
426532.0%
 
ValueCountFrequency (%) 
14801< 0.1%
 
13391< 0.1%
 
11661< 0.1%
 
11561< 0.1%
 
11071< 0.1%
 

launch_time
Real number (ℝ≥0)

ZEROS

Distinct count23
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.5111158630555166
Minimum0
Maximum76
Zeros87751
Zeros (%)64.7%
Memory size1.0 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile2
Maximum76
Range76
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.8905220551
Coefficient of variation (CV)1.742309561
Kurtosis485.042345
Mean0.5111158631
Median Absolute Deviation (MAD)0
Skewness8.761561022
Sum69316
Variance0.7930295307
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
08775164.7%
 
13299424.3%
 
2104517.7%
 
331452.3%
 
49020.7%
 
52470.2%
 
6720.1%
 
723< 0.1%
 
89< 0.1%
 
164< 0.1%
 
Other values (13)19< 0.1%
 
ValueCountFrequency (%) 
08775164.7%
 
13299424.3%
 
2104517.7%
 
331452.3%
 
49020.7%
 
ValueCountFrequency (%) 
761< 0.1%
 
471< 0.1%
 
371< 0.1%
 
272< 0.1%
 
251< 0.1%
 

chinese_subscribe_num
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.0 MiB
ValueCountFrequency (%) 
09404569.3%
 
14157230.7%
 

math_subscribe_num
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.0 MiB
ValueCountFrequency (%) 
012562692.6%
 
199917.4%
 

add_friend
Boolean

HIGH CORRELATION

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.0 MiB
ValueCountFrequency (%) 
113509999.6%
 
05180.4%
 

add_group
Boolean

HIGH CORRELATION

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.0 MiB
ValueCountFrequency (%) 
113509999.6%
 
05180.4%
 

camp_num
Real number (ℝ≥0)

ZEROS

Distinct count7
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1.6076008170067175
Minimum0
Maximum6
Zeros9741
Zeros (%)7.2%
Memory size1.0 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median1
Q32
95-th percentile3
Maximum6
Range6
Interquartile range (IQR)1

Descriptive statistics

Standard deviation0.9602472176
Coefficient of variation (CV)0.5973169505
Kurtosis1.246628902
Mean1.607600817
Median Absolute Deviation (MAD)1
Skewness0.9132631033
Sum218018
Variance0.9220747189
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
16106145.0%
 
24511433.3%
 
31370310.1%
 
097417.2%
 
444053.2%
 
515581.1%
 
635< 0.1%
 
ValueCountFrequency (%) 
097417.2%
 
16106145.0%
 
24511433.3%
 
31370310.1%
 
444053.2%
 
ValueCountFrequency (%) 
635< 0.1%
 
515581.1%
 
444053.2%
 
31370310.1%
 
24511433.3%
 

learn_num
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count26
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.3129548655404557
Minimum0
Maximum25
Zeros27334
Zeros (%)20.2%
Memory size1.0 MiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median3
Q35
95-th percentile9
Maximum25
Range25
Interquartile range (IQR)4

Descriptive statistics

Standard deviation2.966820872
Coefficient of variation (CV)0.8955210657
Kurtosis1.730926293
Mean3.312954866
Median Absolute Deviation (MAD)2
Skewness1.129273811
Sum449293
Variance8.802026085
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
02733420.2%
 
11846313.6%
 
41655012.2%
 
51652212.2%
 
31604111.8%
 
21597911.8%
 
669035.1%
 
748973.6%
 
841553.1%
 
931212.3%
 
Other values (16)56524.2%
 
ValueCountFrequency (%) 
02733420.2%
 
11846313.6%
 
21597911.8%
 
31604111.8%
 
41655012.2%
 
ValueCountFrequency (%) 
251< 0.1%
 
242< 0.1%
 
235< 0.1%
 
227< 0.1%
 
217< 0.1%
 

finish_num
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct count26
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.692140365883333
Minimum0
Maximum25
Zeros45739
Zeros (%)33.7%
Memory size1.0 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median2
Q35
95-th percentile8
Maximum25
Range25
Interquartile range (IQR)5

Descriptive statistics

Standard deviation2.886858906
Coefficient of variation (CV)1.072328524
Kurtosis2.110070635
Mean2.692140366
Median Absolute Deviation (MAD)2
Skewness1.273291544
Sum365100
Variance8.333954345
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
04573933.7%
 
51664512.3%
 
11544611.4%
 
31356510.0%
 
2132779.8%
 
4126379.3%
 
655124.1%
 
734642.6%
 
830942.3%
 
1022001.6%
 
Other values (16)40383.0%
 
ValueCountFrequency (%) 
04573933.7%
 
11544611.4%
 
2132779.8%
 
31356510.0%
 
4126379.3%
 
ValueCountFrequency (%) 
251< 0.1%
 
241< 0.1%
 
236< 0.1%
 
225< 0.1%
 
217< 0.1%
 

study_num
Boolean

Distinct count2
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.0 MiB
ValueCountFrequency (%) 
011261083.0%
 
12300717.0%
 

coupon
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct count33
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.15896974568085123
Minimum0
Maximum112
Zeros122104
Zeros (%)90.0%
Memory size1.0 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum112
Range112
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.9146963991
Coefficient of variation (CV)5.753902387
Kurtosis6206.353908
Mean0.1589697457
Median Absolute Deviation (MAD)0
Skewness58.57520335
Sum21559
Variance0.8366695025
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012210490.0%
 
195637.1%
 
224371.8%
 
37880.6%
 
43400.3%
 
51660.1%
 
6780.1%
 
739< 0.1%
 
922< 0.1%
 
821< 0.1%
 
Other values (23)59< 0.1%
 
ValueCountFrequency (%) 
012210490.0%
 
195637.1%
 
224371.8%
 
37880.6%
 
43400.3%
 
ValueCountFrequency (%) 
1122< 0.1%
 
1081< 0.1%
 
1021< 0.1%
 
541< 0.1%
 
511< 0.1%
 

course_order_num
Real number (ℝ≥0)

ZEROS

Distinct count22
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.10562097672120752
Minimum0
Maximum24
Zeros127009
Zeros (%)93.7%
Memory size1.0 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile1
Maximum24
Range24
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.5528032321
Coefficient of variation (CV)5.233839426
Kurtosis237.5985816
Mean0.1056209767
Median Absolute Deviation (MAD)0
Skewness11.54324799
Sum14324
Variance0.3055914135
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
012700993.7%
 
158164.3%
 
215781.2%
 
35800.4%
 
42890.2%
 
51360.1%
 
6820.1%
 
733< 0.1%
 
828< 0.1%
 
913< 0.1%
 
Other values (12)53< 0.1%
 
ValueCountFrequency (%) 
012700993.7%
 
158164.3%
 
215781.2%
 
35800.4%
 
42890.2%
 
ValueCountFrequency (%) 
241< 0.1%
 
231< 0.1%
 
201< 0.1%
 
191< 0.1%
 
182< 0.1%
 

Interactions

[Figures omitted: 121 pairwise interaction plots (11 × 11 numeric variables), rendered with Matplotlib v3.3.3]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r (only the sign of the slope does).

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
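
As a minimal sketch of that recipe, the function below divides the covariance by the product of the standard deviations. It uses only the standard library, and the sample lists are made-up illustration data, not values from this dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r: covariance of xs and ys over the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return covariance / (std_x * std_y)


# Hypothetical sample data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
# Shifting and rescaling ys (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r(xs, [3.0 * y + 7.0 for y in ys])
print(r, r_rescaled)
```

The same 1/n factor appears in the covariance and in both standard deviations, so it cancels; using n - 1 throughout would give the same r.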

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
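
A minimal sketch of that procedure: replace each value by its rank (ties share the average of their positions, a common convention that is assumed here), then apply the covariance-over-standard-deviations formula to the ranks. The sample data are made up for the illustration.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return covariance / (std_x * std_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y = x**3 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

For this monotonic but nonlinear example ρ reaches 1 while Pearson's r stays below 1, which is the difference the paragraph above describes.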

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
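
The pair-counting definition can be sketched directly. The version below is the plain form of the formula (often called τ-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries often report tie-adjusted variants such as τ-b, so values can differ when ties are present. The sample lists are made up for the illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # direction == 0 means a tie in x or y; it counts toward neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

In this example five of the six pairs are concordant and one is discordant, giving (5 - 1) / 6.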

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available online.

Missing values

[Figures omitted: 2 missing-values plots, rendered with Matplotlib v3.3.3]

Sample

First rows

user_id login_day login_diff_time distance_day login_time launch_time chinese_subscribe_num math_subscribe_num add_friend add_group camp_num learn_num finish_num study_num coupon course_order_num
0200000155594528076.86131111011000004
1200000155664522841.0081311111210000
2200000155804780410.00179301011200000
3200000155814646761.00322430011155001
4200000155814687841.753613900111200100
5200000155814737141.25466821011211100
6200000155904523310.00182800011100000
7200000155924592041.50363800011100000
8200000155924603741.75368610011111000
9200000155924782541.75223100011266000

Last rows

user_id login_day login_diff_time distance_day login_time launch_time chinese_subscribe_num math_subscribe_num add_friend add_group camp_num learn_num finish_num study_num coupon course_order_num
135607200000294731629610.0020100011100000
135608200000294731693210.00200011100000
135609200000294731707110.00201011100000
135610200000294731748310.00100011100000
135611200000294731762110.00700011110000
135612200000294731772610.00200011100000
135613200000294731775810.00200011100000
1356142000002947317827-1-1.0-1000011100000
135615200000294731794110.0039300011100000
135616200000294801477910.00400011200000